Evaluating Robustness for 'IPCRESS': Surrey's Text Alignment for Plagiarism Detection
Authors
Abstract
This paper briefly describes our approach to the Text Alignment subtask of the Plagiarism Detection track at PAN 14. With research funding secured, we have reimplemented our PAN12 approach in a consistent programmatic manner. PAN 14 offers us the first opportunity to evaluate the performance and consistency of this reimplementation. We present its results on various PAN collections.
Similar resources
Guess Again and See if They Line up: Surrey's Runs at Plagiarism Detection Notebook for PAN at CLEF 2013
This paper briefly describes the approaches taken to the two subtasks of Source Retrieval and Text Alignment, in the Plagiarism Detection track at PAN 13. For the first of these, we reuse our PAN12 approach – which combines frequency and a contrastive corpus measure to select keywords for querying the ChatNoir search system; for the second, we reuse software that had previously featured in PAN1...
On Supply Chains, Deperimeterization and the Ipcress Solution
This paper offers an overview of the dual challenges involved with protecting Intellectual Property in the distribution of valuable business information that needs to be read and acted upon by others. We introduce IPCRESS as a means to engender trust in an inherently deperimeterized supply chain likely acting through the Cloud, discuss the context and requirements of such a system, and in relat...
Developing Bilingual Plagiarism Detection Corpus Using Sentence Aligned Parallel Corpus: Notebook for PAN at CLEF 2015
Plagiarism detection is the process of locating text reuse within a suspicious document. Plagiarism detection corpora are used for evaluating plagiarism detection systems. In this paper, we present a bilingual Persian-English plagiarism detection corpus. We provide our corpus for the task of text alignment corpus construction in the PAN 2015 competition. Our approach is based on parallel cor...
Algorithms and Corpora for Persian Plagiarism Detection: Overview of PAN at FIRE 2016
The task of plagiarism detection is to find passages of text-reuse in a suspicious document. This task is of increasing relevance, since scholars around the world take advantage of the fact that information about nearly any subject can be found on the World Wide Web by reusing existing text instead of writing their own. We organized the Persian PlagDet shared task at PAN 2016 in an effort to pr...
A Text Alignment Corpus for Persian Plagiarism Detection
This paper describes how a Persian text alignment corpus was constructed to evaluate plagiarism detection systems. This corpus is in PAN format and contains 11,089 documents and more than 11,603 plagiarism cases. Efforts were made to simulate various types of plagiarism manually, semi-automatically, or automatically in this large-scale corpus.
Publication date: 2014